Automated Text Summarization
نویسنده
چکیده
After lying dormant for over two decades, automated text summarization has experienced a tremendous resurgence of interest in the past few years. Research is being conducted in China, Europe, Japan, and North America, and industry has brought to market more than 30 summarization systems; most recently, a series of large-scale text summarization evaluations, Document Understanding Conference (DUC) and Text Summarization Challenge (TSC) have been held yearly in the United States and Japan. In this tutorial, we will review the state of the art in automatic summarization, and will discuss and critically evaluate current approaches to the problem. We will first outline the major types of summary: indicative vs. informative; abstract vs. extract; generic vs. query-oriented; background vs. just-the-news; single-document vs. multidocument; and so on. We will describe the typical decomposition of summarization into three stages, and explain in detail the major approaches to each stage. For topic identification, we will outline techniques based on stereotypical text structure, cue words, highfrequency indicator phrases, intratext connectivity, and discourse structure centrality. For topic fusion, we will outline some ideas that have been proposed, including concept generalization and semantic association. For summary generation, we will describe the problems of sentence planning to achieve information compaction. How good is a summary? Evaluation is a difficult issue. We will describe various suggested measures and discuss the adequacy of current evaluation methods including manual evaluation procedures used in DUC, the factoid and pyramid method reference summary creation procedures and fully automatic evaluation method such as ROUGE. The recently developed automatic evaluation method based on basic element (BE) will also be covered. Throughout, we will highlight the strengths and weaknesses of statistical and symbolic/linguistic techniques in implementing efficient summarization systems. We will discuss ways in which summarization systems can interact with and/or complement natural language generation, discourse parsing, information extraction, and information retrieval systems. Finally, we will present a set of open problems that we perceive as being crucial for immediate progress in automatic summarization.
منابع مشابه
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher through such massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
متن کاملText Summarization Using Cuckoo Search Optimization Algorithm
Today, with rapid growth of the World Wide Web and creation of Internet sites and online text resources, text summarization issue is highly attended by various researchers. Extractive-based text summarization is an important summarization method which is included of selecting the top representative sentences from the input document. When, we are facing into large data volume documents, the extr...
متن کاملSummarization of Diagrams in Documents
Documents are composed of text and graphics. There is substantial work on automated text summarization but almost none on the automated summarization of graphics. Four examples of diagrams from the scientific literature are used to indicate the problems and possible solutions: a table of images, a flow chart, a set of x,y data plots, and a block diagram. Manual summaries are first constructed. ...
متن کاملEXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS
Due to the explosive growth of the world-wide web, automatictext summarization has become an essential tool for web users. In this paperwe present a novel approach for creating text summaries. Using fuzzy logicand word-net, our model extracts the most relevant sentences from an originaldocument. The approach utilizes fuzzy measures and inference on theextracted textual information from the docu...
متن کاملAutomated Text Summarization In SUMMARIST
SUMMARIST is an attempt to create a robust automated text summanzaUon system, based on the 'equation' summarization = topw Ment:ficatwn + mterpretatwn + generatwn We descnbe the system's arclutecture and provide detmls of some of its modules
متن کاملSystematic literature review of fuzzy logic based text summarization
Information Overloadrq is not a new term but with the massive development in technology which enables anytime, anywhere, easy and unlimited access; participation & publishing of information has consequently escalated its impact. Assisting userslq informational searches with reduced reading surfing time by extracting and evaluating accurate, authentic & relevant information are the primary c...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005